Free software for anonymizing digital imaging and communications in
medicine (DICOM) images is needed to protect patient information from
third parties. This study aimed to develop software for anonymizing
patient information in computed tomography (CT) images. A total of 17
data elements were anonymized: Patient’s Name, Other Patient Names,
Patient ID, Patient’s Birth Date, Patient’s Sex, Patient’s Age, Study ID,
Study Description, Series Description, Institution Name, Institution
Address, Referring Physician’s Name, Consulting Physician’s Name,
Performing Physician’s Name, Name of Physician(s) Reading Study,
Operators’ Name, and Protocol Name. For each data element, the initial
value was replaced with the dummy string "N/A". For testing, patient CT
images were collected from four hospitals with different scanners. Each
scanner was found to store a different set of data elements in its DICOM
files. Nevertheless, the anonymization process worked well for all four
hospitals, with an accuracy of 100%. The developed software can anonymize
DICOM images flexibly and successfully, and it can be used to protect
patient information.
1. INTRODUCTION 

Computed tomography (CT) is a major medical imaging modality in the health care system [1]. CT
produces images that map the attenuation coefficients of the tissue inside the patient [2]. CT images are
reconstructed from sinogram data into axial images [3], [4]. These axial images can be distributed through
the picture archiving and communication system (PACS) for diagnosis, communication, and storage purposes
[5], making them easy to share with various medical devices on the network within healthcare facilities [6].

CT images are stored in a standardized format, namely digital imaging and
communications in medicine (DICOM) [7]-[11]. In a single examination, CT reconstructs multiple DICOM
images at once. The number of images depends on the scan length, slice thickness, slice interval, and pitch. In
addition to the image itself, a DICOM file contains information about the examination, the patient,
the institution, and other details, stored in so-called DICOM tags [9]. The standardization of DICOM
makes it widely used, not only for clinical purposes, but also for education and research [12], [13].
However, sharing or exchanging confidential information about patients requires data protection to ensure
privacy and security [14]. Leakage of important information can result in privacy violations, as well as
intrusion into the system [15], [16]. Therefore, tools are needed to protect sensitive information, one of which
is the anonymization of DICOM [17]-[20].

While several tools are available for de-identifying DICOM data, including those integrated into
DICOM viewer software [21]-[24], many of these tools are commercial programs and may not be accessible
to everyone. Additionally, these tools may not always anonymize specific data effectively [7],
highlighting the need for freely available and user-friendly software for DICOM anonymization. Therefore,
this study aims to develop free software with a simple graphical user interface (GUI) for anonymizing
patient information stored in CT images. The developed software will be a valuable tool for medical staff
seeking to protect patient privacy.


2. METHOD 
2.1. DICOM standard 

DICOM is a standard protocol for medical image communication systems. It covers
information and transmission management to facilitate communication and diagnostics, and it is widely used
in many healthcare facilities [25]. Because of its digital nature, it can substitute for film media in
image storage and reading. DICOM was initially introduced by the National Electrical Manufacturers
Association (NEMA) and the American College of Radiology (ACR). DICOM is a recognized trademark and is
used as a standard for medical imaging communication across various types of modalities.

A DICOM image contains not only pixel data, but also almost all examination information. This
includes information on the modality used, scan parameters, patient information, and institutional
information. This information is stored in a single file as metadata that can be accessed using tags [20]. Most
tags have a keyword for retrieving a specific data element and its value. NEMA, as the initiator of the DICOM
system, applies a dictionary structure to organize the DICOM tags. Table 1 shows some of the tags used in
DICOM. The tags sort information by category. Each tag has a standard name and a keyword to access its
value. The value can also be accessed directly using the tag.


Table 1. Examples of some tags used by DICOM

Tag          Name                Keyword          Value type
(0010,0010)  Patient’s Name      PatientName      String
(0018,1151)  X-Ray Tube Current  XRayTubeCurrent  Integer
(0008,0022)  Acquisition Date    AcquisitionDate  String
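
The tag notation in Table 1 is a pair of four-digit hexadecimal numbers, the group and the element. As a minimal sketch of how this notation can be handled in Python, the helpers below convert between the printed form and integers. The names `parse_tag`, `format_tag`, and `tag_as_int` are hypothetical and are not part of Pydicom or IndoQCT.

```python
import re

# A tag is printed as "(gggg,eeee)", where both parts are 4 hexadecimal digits.
TAG_PATTERN = re.compile(r"^\(([0-9A-Fa-f]{4}),([0-9A-Fa-f]{4})\)$")


def parse_tag(text):
    """Convert a tag written as '(gggg,eeee)' into a (group, element) pair of integers."""
    match = TAG_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a DICOM tag: {text!r}")
    group, element = (int(part, 16) for part in match.groups())
    return group, element


def format_tag(group, element):
    """Render a (group, element) pair back into the '(gggg,eeee)' notation."""
    return f"({group:04X},{element:04X})"


def tag_as_int(group, element):
    """Pack group and element into one 32-bit integer (group in the high 16 bits)."""
    return (group << 16) | element


# The three tags from Table 1, keyed by keyword.
TABLE_1 = {
    "PatientName": "(0010,0010)",
    "XRayTubeCurrent": "(0018,1151)",
    "AcquisitionDate": "(0008,0022)",
}
```

For instance, `parse_tag(TABLE_1["XRayTubeCurrent"])` yields the group `0x0018` and the element `0x1151`.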


In this study, the DICOM anonymization software was written in the Python programming
language and has been integrated into IndoQCT [26]. DICOM files were accessed using the Pydicom
package [27], [28]. Using this package, DICOM files were read and stored in memory as a dataset.
The dataset is the main object that is processed in various operations, related both to image
processing and to dosimetry measurements [29]. The dataset contains data elements that store the DICOM image
information, namely the tags, names, and values discussed above. To access a data element, its tag or
keyword can be used on the dataset, either as a key or as an attribute. For example, (1) returns the
data element of the patient's name, while (2) returns the value of that data element.


ds['PatientName'] (1)
ds.PatientName (2)


where ds is the dataset read from the DICOM image. To change the value of a specific data
element, a replacement value is assigned to the value attribute of that data element, as in (3).


ds['PatientName'].value = 'new name' (3)


Anonymization replaces the values of sensitive data elements in a dataset with a
new anonymous value. By default, the dummy value is "N/A" (not applicable). The
assignment of the dummy value was repeated for every keyword listed in Table 2.
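
The replacement loop described above can be sketched as follows. This is an illustration under stated assumptions, not the IndoQCT source code: the function name `anonymize` and the stand-in class `Element` are hypothetical, and the dataset is modelled as a plain dictionary so that the block runs without any DICOM file. A dataset returned by `pydicom.dcmread` is expected to accept the same `keyword in ds` and `ds[keyword].value` expressions used here.

```python
# Keywords of the 17 data elements listed in Table 2.
ANONYMIZED_KEYWORDS = [
    "PatientName", "OtherPatientNames", "PatientID", "PatientBirthDate",
    "PatientSex", "PatientAge", "StudyID", "StudyDescription",
    "SeriesDescription", "InstitutionName", "InstitutionAddress",
    "ReferringPhysicianName", "ConsultingPhysicianName",
    "PerformingPhysicianName", "NameOfPhysiciansReadingStudy",
    "OperatorsName", "ProtocolName",
]


class Element:
    """Hypothetical stand-in for a data element; only `value` is modelled."""

    def __init__(self, value):
        self.value = value


def anonymize(ds, keywords=ANONYMIZED_KEYWORDS, dummy="N/A"):
    """Replace the value of every listed data element that is present in `ds`.

    Returns the keywords that were actually replaced, in list order.
    """
    replaced = []
    for keyword in keywords:
        if keyword in ds:  # skip elements the scanner did not write
            ds[keyword].value = dummy
            replaced.append(keyword)
    return replaced


# Stand-in dataset: keyword -> Element (assumption made for this sketch only).
ds = {
    "PatientName": Element("DOE^JOHN"),
    "PatientID": Element("12345"),
    "SliceThickness": Element(5.0),  # not in the list, so it is left untouched
}
done = anonymize(ds)
```

Checking for the keyword before assigning keeps the loop from failing on files that lack some of the elements.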

Figure 1 displays the GUI for anonymizing patient information. The field for a data element is
enabled only if that data element, identified by its keyword in Table 2, is detected in the file. This check
was added because not every DICOM file contains every intended data element, even though the elements are
standardized; which data elements a file contains is vendor-specific. After an image has been anonymized, it
can be saved as a new file in the desired directory and under the desired file name. The developed software
also provides an option for anonymizing all slices.


Table 2. List of anonymized data elements

Tag          Name                                Keyword
(0010,0010)  Patient’s Name                      PatientName
(0010,1001)  Other Patient Names                 OtherPatientNames
(0010,0020)  Patient ID                          PatientID
(0010,0030)  Patient’s Birth Date                PatientBirthDate
(0010,0040)  Patient’s Sex                       PatientSex
(0010,1010)  Patient’s Age                       PatientAge
(0020,0010)  Study ID                            StudyID
(0008,1030)  Study Description                   StudyDescription
(0008,103E)  Series Description                  SeriesDescription
(0008,0080)  Institution Name                    InstitutionName
(0008,0081)  Institution Address                 InstitutionAddress
(0008,0090)  Referring Physician’s Name          ReferringPhysicianName
(0008,009C)  Consulting Physician’s Name         ConsultingPhysicianName
(0008,1050)  Performing Physician’s Name         PerformingPhysicianName
(0008,1060)  Name of Physician(s) Reading Study  NameOfPhysiciansReadingStudy
(0008,1070)  Operators’ Name                     OperatorsName
(0018,1030)  Protocol Name                       ProtocolName



Figure 1. Graphical user interface (GUI) for anonymization 
Table 3. Types of CT scanners used for testing the developed software

Hospital    Scanner                        Protocol
Hospital A  Siemens Emotion 6              Abdomen
Hospital B  Siemens Somatom Definition AS  Thorax with contrast
Hospital C  Toshiba Aquilion               Chest
Hospital D  Hitachi ECLOS                  Abdomen


2.2. Dataset 

Four datasets of clinical images from four hospitals with different types of CT scanners
were used for testing the developed software. The scanner types are listed in Table 3. For each dataset, the
files before and after anonymization were opened in the MicroDicom viewer for comparison.


3. RESULTS AND DISCUSSION 
3.1. Comparison using MicroDicom viewer

Figures 2 to 5 show the original images, with the patient and institutional information blurred, next to
the anonymized images for hospital A (Figure 2), hospital B (Figure 3), hospital C (Figure 4), and
hospital D (Figure 5). The original images contain patient and hospital information, which the viewer
displays on screen by default. After anonymization, this information was replaced with the new value
"N/A", so patient and hospital information no longer appears in the anonymized images. The anonymization
process worked well for all four hospitals.


Figure 2. Anonymized DICOM image at hospital A (panels: original and anonymized)


Figure 3. Anonymized DICOM image at hospital B (panels: original and anonymized)


Figure 4. Anonymized DICOM image at hospital C (panels: original and anonymized)


Figure 5. Anonymized DICOM image at hospital D (panels: original and anonymized)


3.2. Data elements anonymization 

Table 4 shows the anonymization results per data element. Confidential information was anonymized
properly whenever it was available. The information contained in the DICOM files differs among the four
hospitals, depending on the type of scanner and its settings. Some hospitals provide a tag whose value is empty.
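
A per-element check of this kind can be sketched as follows. The status labels follow the abbreviations of Table 4 (S = success, NA = not available); the extra label "F" for a failed replacement, the function names, and the definition of accuracy as the share of available elements that were replaced are assumptions made for this illustration, since the text does not state how the accuracy was computed. Datasets are modelled as plain dictionaries from keyword to value.

```python
def element_status(original, anonymized, keywords, dummy="N/A"):
    """Classify each keyword as 'S' (success), 'NA' (not available) or 'F' (failed)."""
    status = {}
    for keyword in keywords:
        if keyword not in original:
            status[keyword] = "NA"  # the scanner did not store this element
        elif anonymized.get(keyword) == dummy:
            status[keyword] = "S"   # the value was replaced by the dummy value
        else:
            status[keyword] = "F"   # the element exists but still holds another value
    return status


def accuracy(status):
    """Percentage of available elements that were anonymized (None if none were available)."""
    available = [s for s in status.values() if s != "NA"]
    if not available:
        return None
    return 100.0 * available.count("S") / len(available)


# Made-up example values, not data from the study.
original = {"PatientName": "DOE^JOHN", "PatientID": "12345"}
anonymized = {"PatientName": "N/A", "PatientID": "N/A"}
keywords = ["PatientName", "PatientID", "OtherPatientNames"]

status = element_status(original, anonymized, keywords)
```

Elements that a scanner does not store are excluded from the accuracy, which matches how "NA" entries are treated in this sketch.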

A DICOM anonymization program is important for securing sensitive information, both about
patients and about institutions [30]. The aim of this study was to develop a DICOM anonymization
program that can be run on images from various types of CT scanners in various hospitals. The success rate of
the program was tested by running anonymization on the predefined tags.

Table 4 indicates that all the information contained in the DICOM files could be accessed and that its value
could be replaced with the dummy value "N/A" properly. The availability of data elements varies depending on
the type of scanner. Not all CT scanners provide the same specific information as other scanners (e.g.
SpiralPitchFactor), even though the data elements are standardized. The PerformingPhysicianName data element
in Table 4 illustrates this difference in availability. The developed software has so far been tested only on
CT images because the parent program (i.e. IndoQCT) is a program whose main purpose is to evaluate the
quality of CT images [26]. Images from other modalities still need further investigation.

Several limitations exist in this study. The set of data elements included for anonymization is still very
limited and does not meet the NEMA standard (Sup 142) [31]. The anonymization process is also still
limited to replacing the value with a new value in accordance with the purpose of anonymization. Sup 142
recommends a protocol for the anonymization of each data element. For example, PatientName is
recommended to be cleaned completely, or replaced with a new value according to the anonymization purpose.
PatientBirthDate is recommended to be replaced with a zero-length value, and PatientAge can be used in
place of PatientBirthDate if needed for educational purposes. Dummy values can substitute for data
elements that point to institutions, such as InstitutionName or InstitutionAddress. Private tags also need to be
considered because they may contain patient information, as well as important exposure information that
should be protected. Hence, our program still needs further improvement.
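
A per-element protocol of the kind described above could be expressed as a small lookup table. The sketch below only encodes the examples named in the text and is not a reproduction of Sup 142; the action names, the `ACTIONS` table, and the helper functions are hypothetical. The check for private tags relies on the DICOM convention that private data elements use an odd group number.

```python
# Illustrative actions for the examples mentioned in the text (not the full Sup 142 profile).
ACTIONS = {
    "PatientName": "dummy",        # replace with a new value
    "PatientBirthDate": "empty",   # replace with a zero-length value
    "InstitutionName": "dummy",
    "InstitutionAddress": "dummy",
}


def apply_action(values, keyword, dummy="N/A"):
    """Apply the configured action to one keyword of a keyword -> value dictionary."""
    if keyword not in values:
        return
    action = ACTIONS.get(keyword, "keep")  # unlisted keywords are left unchanged
    if action == "dummy":
        values[keyword] = dummy
    elif action == "empty":
        values[keyword] = ""


def find_private(tags):
    """Return the (group, element) pairs whose group number is odd, i.e. private tags."""
    return [(group, element) for group, element in tags if group % 2 == 1]


# Made-up example values, not data from the study.
values = {"PatientName": "DOE^JOHN", "PatientBirthDate": "19700101", "PatientAge": "050Y"}
for keyword in list(values):
    apply_action(values, keyword)

private = find_private([(0x0010, 0x0010), (0x0009, 0x1001)])
```

Listing the private tags first, rather than deleting them outright, leaves room to review whether they hold exposure information that should be kept.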


Table 4. Anonymization results on various hospitals

Name                                Hospital A  Hospital B  Hospital C  Hospital D
Patient’s Name                      S           S           S           S
Other Patient Names                 S           NA          NA          NA
Patient ID                          S           S           S           S
Patient’s Birth Date                S           S           S           S
Patient’s Sex                       S           S           S           S
Patient’s Age                       S           S           S           S
Study ID                            S           S           S           S
Study Description                   S           S           NA          S
Series Description                  S           S           S           S
Institution Name                    S           S           S           S
Institution Address                 S           S           NA          NA
Referring Physician’s Name          S           S           S           S
Consulting Physician’s Name         NA          NA          NA          S
Performing Physician’s Name         NA          S           NA          S
Name of Physician(s) Reading Study  NA          NA          NA          S
Operators’ Name                     S           S           NA          S
Protocol Name                       S           S           S           S

* Abbreviation: S = Success; NA = Not available


Another limitation concerns information burnt into the image pixels [14], [32],
[33], for example in the dose report image and the tomogram image. In this case, the actual pixels of the
image must be changed; otherwise, anonymization at the metadata level loses its
purpose [21]. Sensitive information in dose reports and DICOM tags can give third parties a way to
infiltrate both online and local systems [34], [35]. Therefore, it is important for the anonymization process to
be enforced according to standards.
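
Changing the actual pixels can be as simple as overwriting the region that holds the burnt-in text. The sketch below is a minimal illustration that is not part of the developed software: the function name `blank_region` is hypothetical, the image is modelled as a list of rows, and the position of the rectangle is assumed to be known in advance.

```python
def blank_region(pixels, top, left, height, width, fill=0):
    """Return a copy of a 2-D pixel grid with one rectangle overwritten by `fill`.

    The rectangle is clipped to the image, and the input grid is not modified.
    """
    masked = [row[:] for row in pixels]  # copy every row so the original stays intact
    for r in range(max(top, 0), min(top + height, len(masked))):
        row = masked[r]
        for c in range(max(left, 0), min(left + width, len(row))):
            row[c] = fill
    return masked


# Made-up 3 x 3 image; the top-left 1 x 2 rectangle stands for burnt-in text.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
masked = blank_region(image, top=0, left=0, height=1, width=2)
```

Finding the rectangle in the first place, for example on a dose report page, is the harder part and is outside the scope of this sketch.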

Anonymization programs perform their tasks locally or offline. This means all processes are
executed after the DICOM image has been exported to the user's device. To ensure the protection of patient
information, medical staff must supervise the anonymization on the user's device before the
anonymized DICOM files are handed over to third parties. In other cases, medical staff may transfer DICOM
files directly from the PACS server to the user's device. Keeping the original and anonymized files
organized separately is then a concern. An alternative is to anonymize DICOM files in the direct export
feature of PACS [16]. This will be carried out in our future work.


4. CONCLUSION 

Software to anonymize patient information stored in CT images has been successfully developed. A
total of 17 data elements can be anonymized with a dummy value. Although the information stored by each
vendor differs, the developed software handles these differences well. With this software,
patient data can be anonymized, which is useful for various purposes, for example research,
in order to maintain patient confidentiality.